
All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation


Abstract

Deep neural networks are difficult to train, and this predicament becomes worse as the depth increases. The essence of the problem lies in the magnitude of backpropagated errors, which leads to gradient vanishing or exploding. We show that a variant of regularizer which utilizes orthonormality among different filter banks can alleviate this problem. Moreover, we design a backward error modulation mechanism based on the quasi-isometry assumption between two consecutive parametric layers. Equipped with these two ingredients, we propose several novel optimization solutions that can be utilized for training a specific-structured (repetitive triple modules of Conv-BN-ReLU) extremely deep convolutional neural network (CNN) WITHOUT any shortcuts/identity mappings from scratch. Experiments show that our proposed solutions achieve distinct improvements for a 44-layer and a 110-layer plain network on both the CIFAR-10 and ImageNet datasets. Moreover, we can successfully train plain CNNs to match the performance of their residual counterparts. Besides, we propose new principles for designing network structures from the insights evoked by orthonormality. Combined with the residual structure, we achieve comparable performance on the ImageNet dataset.
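The abstract does not spell out the exact form of the orthonormality regularizer, but the core idea of penalizing deviation of a layer's filter bank from an orthonormal set can be sketched as below. This is a minimal illustration assuming PyTorch and the standard (out_channels, in_channels, kH, kW) convolution weight layout; the names `orthonormal_penalty`, `total_loss`, and the weight `lam` are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

def orthonormal_penalty(weight: torch.Tensor) -> torch.Tensor:
    """||W W^T - I||_F^2 over a conv layer's flattened filter bank.

    Each output filter becomes a row of W; the penalty pushes rows
    toward unit norm and mutual orthogonality, keeping the layer
    close to an isometry. (Exact orthonormality is only attainable
    when out_channels <= in_channels * kH * kW.)
    """
    W = weight.reshape(weight.size(0), -1)   # (out, in*kH*kW)
    gram = W @ W.t()                         # pairwise filter inner products
    eye = torch.eye(gram.size(0), device=W.device, dtype=W.dtype)
    return ((gram - eye) ** 2).sum()

# Hypothetical usage: sum the penalty over all conv layers and add it
# to the task loss with a small weighting factor.
def total_loss(task_loss: torch.Tensor, model: nn.Module,
               lam: float = 1e-4) -> torch.Tensor:
    reg = sum(orthonormal_penalty(m.weight)
              for m in model.modules() if isinstance(m, nn.Conv2d))
    return task_loss + lam * reg
```

In this sketch the penalty is applied uniformly to every convolutional layer; the paper's actual solutions combine such a regularizer with the backward error modulation mechanism, whose details are not given in the abstract.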